Automated Essay Scoring For Nonnative English Speakers

Authors

  • Jill Burstein
  • Martin Chodorow
Abstract

The e-rater system [1] is an operational automated essay scoring system developed at Educational Testing Service (ETS). The average agreement between human readers, and between independent human readers and e-rater, is approximately 92%. There is much interest in the larger writing community in examining the system's performance on nonnative-speaker essays. This paper reports results of a study of e-rater's performance on Test of Written English (TWE) essay responses written by nonnative English speakers whose native language is Chinese, Arabic, or Spanish. In addition, one small sample of the data is from US-born English speakers, and another is from non-US-born candidates who report that their native language is English. As expected, significant differences were found between the scores of the English-speaking groups and those of the nonnative speakers. While there were also differences between e-rater and the human readers for the various language groups, the average agreement rate was as high as operational agreement. At least four of the five features included in e-rater's current operational models (including discourse, topical, and syntactic features) also appear in the TWE models. This suggests that the features generalize well over a wide range of linguistic variation: e-rater was not confounded by non-standard English syntactic structures or stylistic discourse structures that one might expect to be a problem for a system designed to evaluate native-speaker writing.

[1] The e-rater system is a trademark of Educational Testing Service. In this paper, we refer to the e-rater system as e-rater.
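The 92% figure above is a rater-agreement rate. As a minimal sketch (not ETS's actual evaluation code), agreement between two raters is commonly counted as the fraction of essays whose two scores differ by at most one point on the scoring scale; the scores below are hypothetical:

```python
def agreement_rate(scores_a, scores_b, tolerance=1):
    """Fraction of essays where the two raters' scores differ by at most `tolerance`."""
    assert len(scores_a) == len(scores_b)
    matches = sum(1 for a, b in zip(scores_a, scores_b) if abs(a - b) <= tolerance)
    return matches / len(scores_a)

# Hypothetical human and e-rater scores on a 1-6 scale
human = [4, 5, 3, 6, 2, 4, 5, 3]
e_rater = [4, 4, 3, 5, 4, 4, 5, 3]
print(agreement_rate(human, e_rater))  # 7 of 8 pairs within one point -> 0.875
```

Setting `tolerance=0` instead counts only exact score matches, a stricter criterion also reported in rater-agreement studies.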


Similar Articles

Exploring Automated Essay Scoring for Nonnative English Speakers

Automated Essay Scoring (AES) has been quite popular and is being widely used. However, lack of appropriate methodology for rating nonnative English speakers’ essays has meant a lopsided advancement in this field. In this paper, we report initial results of our experiments with nonnative AES that learns from manual evaluation of nonnative essays. For this purpose, we conducted an exercise in wh...


Automated Essay Scoring for Nonnative English Speakers

The e-rater system, an automated essay scoring system developed at Educational Testing Service (ETS), is currently being used to score essay responses on the Graduate Management Admissions Test (GMAT). The average agreement between human readers, and between independent human readers and e-rater, is approximately 92%. There is much interest in the larger writing community in examining the sy...


Towards Using Conversations with Spoken Dialogue Systems in the Automated Assessment of Non-Native Speakers of English

Existing speaking tests only require nonnative speakers to engage in dialogue when the assessment is done by humans. This paper examines the viability of using off-the-shelf systems for spoken dialogue and for speech grading to automate the holistic scoring of the conversational speech of non-native speakers of English.


Non-English Response Detection Method for Automated Proficiency Scoring System

This paper presents a method for identifying non-English speech, with the aim of supporting an automated speech proficiency scoring system for non-native speakers. The method uses a popular technique from the language identification domain: a single phone recognizer followed by multiple language-dependent language models. This method determines the language of a speech sample based on the phonot...


Sentence Processing Among Native vs. Nonnative Speakers: Implications for Critical Period Hypothesis

The present study intended to investigate the processing behavior of 2 groups of L2 learners of English (high and mid in proficiency) and a group of English native speakers on English active and passive reduced relative clauses. Three sets of tasks, an offline task, and 2 online tasks were conducted. Results revealed that the high-proficiency group’s performance was the same as that of the nati...




Journal:

Volume   Issue

Pages  -

Publication date: 1999